DISTA FIN 03 2006 



The art of fitting financial time series with Lévy stable distributions

Enrico Scalas 
Department of Advanced Sciences and Technology, 
Laboratory on Complex Systems, East Piedmont University, 
Via Bellini 25 g, I-15100 Alessandria, Italy*

Kyungsik Kim 

Department of Physics, Pukyong National University, Pusan 608-737, Korea 

(Dated: February 2, 2008) 

Abstract 

This paper illustrates a procedure for fitting financial data with $\alpha$-stable distributions. After

I. INTRODUCTION



There are several stochastic models available to fit financial price or index time series
$\{S_i\}_{i=1}^{N}$ [1]. Building on the ideas presented by Bachelier in his thesis [2, 3],
Mandelbrot proposed, in a seminal paper published in 1963, the Lévy $\alpha$-stable
distribution as a suitable model for price differences, $\xi_i = S_{i+1} - S_i$, or
logarithmic returns, $\xi_i^{\log} = \log(S_{i+1}) - \log(S_i)$ [4]. In the financial
literature, the debate on the Lévy $\alpha$-stable model focused on the infinite variance
of the distribution, leading to the introduction of subordinated models [5-7]; in the physical
literature, Mantegna used the model for the empirical analysis of historical stock-exchange
indices [8]. Later, Mantegna and Stanley proposed a "truncated" Lévy distribution [9-11],
an instance of the so-called KoBoL (Koponen, Boyarchenko and Levendorskii) distributions
[12].

Lévy $\alpha$-stable distributions are characterized by a power-law decay with index
$0 < \alpha < 2$. Fitting the tails of an empirical distribution with a power law is far
from simple. Weron has shown that some popular methods, such as log-log linear regression
and Hill's estimator, give biased results and overestimate the tail exponent for deviates
taken from an $\alpha$-stable distribution [13].

In this paper, a method is proposed for fitting financial log-return time series with
Lévy $\alpha$-stable distributions. It uses the program stable.exe developed by Nolan [14]
and the Chambers-Mallows-Stuck algorithm for the generation of Lévy $\alpha$-stable deviates
[15-17]. The datasets are the daily adjusted close for the DJIA index, taken from
http://finance.yahoo.com, and the daily adjusted close for the MIB30 index, taken from
http://it.finance.yahoo.com, both for the period 1 January 2000 - 3 August 2006. The
two datasets and the program stable.exe are freely available, so that anyone can
reproduce the results reported below and use the method on other datasets.

The definition of Lévy $\alpha$-stable distributions is presented in Section II. Section III
is devoted to the results of the empirical analysis. A critical discussion of these results
can be found in Section IV.

II. THEORY



A random variable $\Xi$ is stable, or stable in the broad sense, if, given two independent
copies of $\Xi$, $\Xi_1$ and $\Xi_2$, and any positive constants $a$ and $b$, there exist
some positive $c$ and some real $d$ such that the sum $a\Xi_1 + b\Xi_2$ has the same
distribution as $c\Xi + d$. If this property holds with $d = 0$ for any $a$ and $b$, then
$\Xi$ is called strictly stable, or stable in the narrow sense. This definition is equivalent
to the following one, which relates the stable property to convolutions of distributions and
to the generalization of the central limit theorem [18]: A random variable $\Xi$ is stable
if and only if, for all $n > 1$ independent and identical copies of $\Xi$,
$\Xi_1, \ldots, \Xi_n$, there exist a positive constant $c_n$ and a real constant $d_n$ such
that the sum $\Xi_1 + \ldots + \Xi_n$ has the same distribution as $c_n \Xi + d_n$. It turns
out that a random variable $\Xi$ is stable if and only if it has the same distribution as
$aZ + b$, where $0 < \alpha \le 2$, $-1 \le \beta \le 1$, $a > 0$, $b \in \mathbb{R}$, and
$Z$ is a random variable with the following characteristic function:



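$$\langle \exp(i\kappa Z) \rangle = \exp\left( -|\kappa|^{\alpha} \left[ 1 - i\beta \tan\left(\frac{\pi\alpha}{2}\right) \mathrm{sign}(\kappa) \right] \right) \quad \text{if } \alpha \neq 1 \tag{1}$$

and

$$\langle \exp(i\kappa Z) \rangle = \exp\left( -|\kappa| \left[ 1 + i\beta \left(\frac{2}{\pi}\right) \mathrm{sign}(\kappa) \log(|\kappa|) \right] \right) \quad \text{if } \alpha = 1. \tag{2}$$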



Thus, four parameters ($a$, $b$, $\alpha$, $\beta$) are needed to specify a general stable
distribution. Unfortunately, the parameterization is not unique and this has caused several
errors [19]. In this paper, a parameterization due to Samorodnitsky and Taqqu [20] is used:



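$$\langle \exp(i\kappa \Xi) \rangle = \exp\left( -\gamma^{\alpha} |\kappa|^{\alpha} \left[ 1 - i\beta \tan\left(\frac{\pi\alpha}{2}\right) \mathrm{sign}(\kappa) \right] + i\delta\kappa \right) \quad \text{if } \alpha \neq 1 \tag{3}$$

and

$$\langle \exp(i\kappa \Xi) \rangle = \exp\left( -\gamma |\kappa| \left[ 1 + i\beta \left(\frac{2}{\pi}\right) \mathrm{sign}(\kappa) \log(|\kappa|) \right] + i\delta\kappa \right) \quad \text{if } \alpha = 1. \tag{4}$$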



This parameterization is called S1 in stable.exe. The program uses a different
parameterization (called S0) for numerical calculations:

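$$\langle \exp(i\kappa \Xi_0) \rangle = \exp\left( -\gamma^{\alpha} |\kappa|^{\alpha} \left[ 1 - i\beta \tan\left(\frac{\pi\alpha}{2}\right) \mathrm{sign}(\kappa) \right] + i \left[ \delta_0 - \beta\gamma \tan\left(\frac{\pi\alpha}{2}\right) \right] \kappa \right) \quad \text{if } \alpha \neq 1 \tag{5}$$

and

$$\langle \exp(i\kappa \Xi_0) \rangle = \exp\left( -\gamma |\kappa| \left[ 1 + i\beta \left(\frac{2}{\pi}\right) \mathrm{sign}(\kappa) \log(|\kappa|) \right] + i \left[ \delta_0 - \beta \left(\frac{2}{\pi}\right) \gamma \log\gamma \right] \kappa \right) \quad \text{if } \alpha = 1. \tag{6}$$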



TABLE I: Mean, variance, skewness, and kurtosis of the two log-return time series.

Index    Mean                   Variance               Skewness   Kurtosis
MIB30    $7.3 \cdot 10^{-5}$    $1.6 \cdot 10^{-4}$    0.22       6.7
DJIA     $6.1 \cdot 10^{-6}$    $1.3 \cdot 10^{-4}$    0.038      6.6


The above equations are modified versions of Zolotarev's parameterizations [21]. Notice that
in Eqs. (3) and (4), the scale $\gamma$ is positive and the location parameter $\delta$ has
values in $\mathbb{R}$.
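
As an illustration of the two parameterizations, the following minimal sketch evaluates the characteristic function of Eqs. (3) and (4) and converts an S0 location into an S1 location. The function names are hypothetical and are not part of stable.exe. The conversion assumes that the location term in Eqs. (5) and (6) reads $\delta_0 - \beta\gamma\tan(\pi\alpha/2)$ for $\alpha \neq 1$ and $\delta_0 - \beta(2/\pi)\gamma\log\gamma$ for $\alpha = 1$.

```python
import cmath
import math


def cf_s1(kappa, alpha, beta, gamma, delta):
    """Characteristic function of Eqs. (3)-(4) (S1 parameterization)."""
    if kappa == 0.0:
        return complex(1.0, 0.0)  # any characteristic function equals 1 at the origin
    sign = math.copysign(1.0, kappa)
    if alpha != 1.0:
        skew = 1.0 - 1j * beta * math.tan(math.pi * alpha / 2.0) * sign
        exponent = -((gamma * abs(kappa)) ** alpha) * skew
    else:
        skew = 1.0 + 1j * beta * (2.0 / math.pi) * sign * math.log(abs(kappa))
        exponent = -gamma * abs(kappa) * skew
    return cmath.exp(exponent + 1j * delta * kappa)


def s0_to_s1_location(alpha, beta, gamma, delta0):
    """S1 location corresponding to an S0 location (assumed reading of Eqs. (5)-(6))."""
    if alpha != 1.0:
        return delta0 - beta * gamma * math.tan(math.pi * alpha / 2.0)
    return delta0 - beta * (2.0 / math.pi) * gamma * math.log(gamma)


# Example call with the SC estimates for the DJIA from Table II.
print(cf_s1(10.0, 1.81, 0.129, 6.69e-3, 1.79e-4))
```

For $\beta = 0$ the two parameterizations coincide, so the conversion only matters for skewed fits.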

III. RESULTS 

Daily values for the MIB30 and DJIA adjusted close have been downloaded from
http://it.finance.yahoo.com and http://finance.yahoo.com, respectively, for the
period 1 January 2000 - 3 August 2006 [22]. The MIB30 is an index comprising 30 "blue chip"
shares on the Italian Stock Exchange, whereas the DJIA is a weighted average of the prices
of 30 industrial companies that are representative of the market as a whole. Notice that
the MIB30 includes non-industrial shares. Therefore, the two indices cannot be used for
comparisons of the behaviour of the industrial sector in Italy and in the USA. Moreover,
the composition of indices varies with time, making it problematic to compare different
historical periods. However, the following analysis concerns the statistical properties of the
two indices, considered representative of the stock-exchange average trends. The datasets
are presented in Figs. 1-4. Figs. 1 and 2 report the index value as a function of trading
day. There are 1709 points in the MIB30 dataset and 1656 points in the DJIA dataset.
Correspondingly, there are 1708 MIB30 log-returns and 1655 DJIA log-returns. They are
represented in Figs. 3 and 4, respectively. The intermittent behaviour typical of log-return
time series can already be detected by visual inspection of Figs. 3 and 4, but this property
will not be studied further.

In Table I, the mean, variance, skewness, and kurtosis are reported for both the MIB30 
and the DJIA log-return time series. 
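
The quantities in Table I can be computed along the following lines. This is a minimal sketch, not the code used for the paper: the function names are hypothetical, the moments use the population ($1/n$) normalization, and the kurtosis is the non-excess one (equal to 3 for a Gaussian). The paper does not state which conventions were used for Table I, so both choices are assumptions.

```python
import math


def log_returns(prices):
    """Logarithmic returns log(S_{i+1}) - log(S_i) of a price series."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def sample_moments(x):
    """Mean, variance, skewness, and non-excess kurtosis of a sample.

    Assumption: population (1/n) normalization for all central moments.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical toy prices, not the MIB30 or DJIA data.
toy_prices = [100.0, 101.0, 99.5, 100.2, 102.0, 101.1]
print(sample_moments(log_returns(toy_prices)))
```

A series of $N$ prices yields $N - 1$ log-returns, which is why 1709 and 1656 index values give 1708 and 1655 log-returns.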

The two log-return time series were given as input to stable.exe. The program implements
three methods for estimating the four parameters of Eqs. (3) and (4). The first one is
based on a maximum likelihood (ML) estimator [14, 23]. The second method uses tabulated
quantiles (QB) of Lévy $\alpha$-stable distributions [24] and is restricted to
$\alpha \ge 0.6$. Finally, in the third method, a regression on the sample characteristic
(SC) function is used [25, 26].

FIG. 1: MIB30: Adjusted close time series.

FIG. 2: DJIA: Adjusted close time series.

In Table II, the estimated values of $\alpha$, $\beta$, $\gamma$, and $\delta$ are reported.
The estimates were obtained with the standard settings of stable.exe. For a preliminary
assessment of the quality of the fits, three synthetic series of log-returns for each index
were generated with the Chambers-Mallows-Stuck algorithm. The empirical complementary
cumulative distribution function (CCDF) for absolute log-returns was compared with the
simulated CCDFs; see Figs. 5 and 6. In both cases, the fit based on the SC function turned
out to be the best one. A refinement of the ML method gave the same values for the four
parameters as the SC algorithm. Therefore, the SC result was selected as the null hypothesis
for two standard quantitative goodness-of-fit tests: the one-sided Kolmogorov-Smirnov (KS)
test and the $\chi^2$ test.
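
The simulation step can be sketched as follows. This is an illustrative implementation of the Chambers-Mallows-Stuck transformation in the form commonly quoted for the S1 parameterization of Eqs. (3) and (4), not the code used for the paper. The function names, the seed, and the choice of the DJIA SC estimates as input are assumptions made for the example.

```python
import math
import random


def cms_stable(alpha, beta, gamma, delta, rng):
    """Draw one alpha-stable deviate in the S1 parameterization of Eqs. (3)-(4)."""
    v = rng.uniform(-math.pi / 2.0, math.pi / 2.0)  # uniform angle
    w = rng.expovariate(1.0)                        # unit-mean exponential (w == 0 is not guarded)
    if alpha != 1.0:
        t = beta * math.tan(math.pi * alpha / 2.0)
        b = math.atan(t) / alpha
        s = (1.0 + t * t) ** (1.0 / (2.0 * alpha))
        x = (s * math.sin(alpha * (v + b)) / math.cos(v) ** (1.0 / alpha)
             * (math.cos(v - alpha * (v + b)) / w) ** ((1.0 - alpha) / alpha))
        return gamma * x + delta
    h = math.pi / 2.0 + beta * v
    x = (2.0 / math.pi) * (h * math.tan(v)
                           - beta * math.log((math.pi / 2.0) * w * math.cos(v) / h))
    return gamma * x + (2.0 / math.pi) * beta * gamma * math.log(gamma) + delta


def empirical_ccdf_abs(sample):
    """Sorted |x| values and the fraction of points strictly above each of them."""
    xs = sorted(abs(v) for v in sample)
    n = len(xs)
    return xs, [1.0 - (i + 1) / n for i in range(n)]


rng = random.Random(12345)  # fixed seed, chosen arbitrarily for reproducibility
# SC estimates for the DJIA from Table II; 1655 points as in the DJIA log-return series.
synthetic = [cms_stable(1.81, 0.129, 6.69e-3, 1.79e-4, rng) for _ in range(1655)]
xs, ccdf = empirical_ccdf_abs(synthetic)
```

Plotting `ccdf` against `xs` on log-log axes, together with the same quantities for the empirical absolute log-returns, gives a comparison of the kind shown in Figs. 5 and 6.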

FIG. 3: MIB30: Log-return time series.

For the KS test, the range of MIB30 log-returns, $(-0.0777, 0.0811)$, was divided into
1654 intervals of width $9.6 \times 10^{-5}$. Then the number of points lying in each
interval was counted and partially summed starting from $-0.0777$, leading to an estimate of
the empirical cumulative distribution function (CDF). The same procedure was used for DJIA
log-returns. In this case the range was $(-0.0615, 0.0740)$, the number of intervals 1693,
and their width $8.0 \times 10^{-5}$. In Figs. 7 and 8, the empirical CDF is plotted together
with the theoretical CDF obtained from the fit based on the SC function. For large sample
sizes, the one-sided KS parameter, $D$, is approximately given by:

$$D = \max_i \left( |\mathrm{CDF}_i - \mathrm{CDF}_{\mathrm{th},i}| \right),$$

where $\mathrm{CDF}_i$ and $\mathrm{CDF}_{\mathrm{th},i}$ are, respectively, the empirical
and the theoretical values corresponding to the $i$-th bin; at 5% significance, $D$ can be
compared with the limiting value $d = 1.36/\sqrt{N}$, where $N$ is the number of empirical
CDF points. For MIB30 log-returns $D = 0.0387$ and $d = 0.0334$, whereas for DJIA log-returns
$D = 0.0232$ and $d = 0.0330$. Therefore, the null hypothesis of $\alpha$-stable distributed
log-returns can be rejected for the MIB30, but not for the DJIA data.
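
The binning procedure and the comparison of $D$ with $d$ can be sketched as follows. The function names are hypothetical, and the toy data and the uniform reference CDF are placeholders. In the actual analysis the theoretical CDF would be the $\alpha$-stable CDF of the SC fit, which is not computed here.

```python
import math


def binned_ecdf(data, n_bins):
    """Right bin edges and cumulative fractions obtained from equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        k = min(int((x - lo) / width), n_bins - 1)  # the maximum goes into the last bin
        counts[k] += 1
    edges, cdf, running = [], [], 0
    for k, c in enumerate(counts):
        running += c                                # partial sums starting from the minimum
        edges.append(lo + (k + 1) * width)
        cdf.append(running / len(data))
    return edges, cdf


def ks_distance(cdf_emp, cdf_th):
    """Largest absolute difference between empirical and theoretical CDF values."""
    return max(abs(e - t) for e, t in zip(cdf_emp, cdf_th))


def ks_limit_5pct(n_points):
    """Limiting value d = 1.36 / sqrt(N) at 5% significance."""
    return 1.36 / math.sqrt(n_points)


# Toy check against a uniform CDF on [0, 1], F(x) = x (hypothetical data).
edges, cdf_emp = binned_ecdf([0.0, 0.25, 0.5, 0.75, 1.0], 4)
D = ks_distance(cdf_emp, edges)
print(D, ks_limit_5pct(1654), ks_limit_5pct(1693))
```

The null hypothesis is rejected at the 5% level when `D` exceeds `ks_limit_5pct(N)`, with `N` the number of empirical CDF points.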

FIG. 4: DJIA: Log-return time series.

For the $\chi^2$ test, the range of MIB30 and DJIA log-returns was divided into 30 equal
intervals. Then, the observed $O_i$ and expected $E_i$ number of points lying in each
interval were evaluated; these data are plotted in Figs. 9 and 10. After aggregating the
intervals with $E_i < 5$, $\chi^2$ was obtained from the formula:

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}.$$

The number of degrees of freedom is given by the number of intervals where $E_i > 5$ minus
5 (4 estimated parameters and the normalization). For MIB30 data, $\chi^2 = 91.5$ with 10
degrees of freedom. The probability that a $\chi^2$-distributed variable exceeds this value
is 0. For the DJIA time series, $\chi^2 = 26.8$ with 11 degrees of freedom. The probability
that a $\chi^2$-distributed variable exceeds this value is 0.005. Again, for DJIA data,
albeit at a low significance level, the null hypothesis may be accepted.
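
A minimal sketch of the $\chi^2$ computation is given below. The paper does not specify how the intervals with $E_i < 5$ were aggregated. The left-to-right pooling used here is one possible choice and is an assumption, as are the function names. The tail probability uses the standard finite series for the $\chi^2$ distribution with an integer number of degrees of freedom.

```python
import math


def merge_sparse_bins(observed, expected, min_expected=5.0):
    """Pool neighbouring bins left to right until each pool has E >= min_expected."""
    pooled, o_acc, e_acc = [], 0.0, 0.0
    for o, e in zip(observed, expected):
        o_acc += o
        e_acc += e
        if e_acc >= min_expected:
            pooled.append((o_acc, e_acc))
            o_acc, e_acc = 0.0, 0.0
    if e_acc > 0.0:                      # sparse remainder at the right end
        if pooled:
            o_last, e_last = pooled.pop()
            pooled.append((o_last + o_acc, e_last + e_acc))
        else:
            pooled.append((o_acc, e_acc))
    return pooled


def chi_square(pooled):
    """Sum of (O - E)^2 / E over the pooled intervals."""
    return sum((o - e) ** 2 / e for o, e in pooled)


def chi2_sf(x, dof):
    """P(chi^2 > x) for a positive integer number of degrees of freedom."""
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal density at chi
    if dof % 2 == 1:
        total, term = math.erfc(chi / math.sqrt(2.0)), chi
        for r in range(1, (dof - 1) // 2 + 1):
            total += 2.0 * z * term
            term *= x / (2 * r + 1)
        return total
    total, term = 1.0, 1.0
    for r in range(1, (dof - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.sqrt(2.0 * math.pi) * z * total


# Tail probabilities for the two values of chi^2 reported in the text.
print(chi2_sf(91.5, 10), chi2_sf(26.8, 11))
```

With this pooling, the number of degrees of freedom used in the text would be `len(pooled) - 5`.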
TABLE II: Estimated parameters of the $\alpha$-stable distribution.

Index    Method   $\alpha$   $\beta$   $\gamma$                $\delta$
MIB30    ML       1.57       0.159     $6.76 \cdot 10^{-3}$    $3.50 \cdot 10^{-4}$
MIB30    QB       1.42       0.108     $6.23 \cdot 10^{-3}$    $5.43 \cdot 10^{-4}$
MIB30    SC       1.72       0.263     $7.06 \cdot 10^{-3}$    $2.53 \cdot 10^{-4}$
DJIA     ML       1.73       0.014     $6.60 \cdot 10^{-3}$    $4.43 \cdot 10^{-5}$
DJIA     QB       1.60       -0.004    $6.21 \cdot 10^{-3}$    $2.69 \cdot 10^{-4}$
DJIA     SC       1.81       0.129     $6.69 \cdot 10^{-3}$    $1.79 \cdot 10^{-4}$



IV. CONCLUSIONS AND OUTLOOK 



This paper illustrates a procedure for fitting financial data with $\alpha$-stable
distributions. The first step is to use all the available methods to evaluate the four
parameters $\alpha$, $\beta$, $\gamma$, and $\delta$. Then, one can qualitatively select the
best estimate and run some goodness-of-fit tests, in order to quantitatively assess the
quality of the fit.

The main conclusion of this paper is that, for the investigated data sets, an
$\alpha$-stable fit is not unreasonable; the best parameter estimate is obtained with a
method based on a sample characteristic function fit. Incidentally, the tail index,
$\alpha$, is 1.72 for MIB30 and 1.81 for DJIA. These values are consistent with previous
results [27] and with remarks made by Weron [13]. The performance in two standard
goodness-of-fit tests (KS and $\chi^2$) is better for DJIA data.

The two hypothesis tests used in this paper have some limitations. For instance, the
KS test is more sensitive to the central part of the distribution and underestimates the tail
contribution. For this reason, it would be better to use the Anderson-Darling (AD) test
[28]. However, a standardized AD test is not available for $\alpha$-stable distributions.
Moreover, the value of $\chi^2$ in the $\chi^2$ test is sensitive to the choice of
intervals, and a detailed analysis of this dependence would be necessary.

Given a set of financial log-returns, is it possible to find the best-fitting distribution?
In general this question is ill-posed. As mentioned in the introduction, there are several
possible competing distributions that can give very similar results in the interval of
interest. Moreover, depending on the specific criterion chosen, different distributions may
turn out to be the best according to that criterion. Therefore, if there is no theory
suggesting the choice of a specific distribution, it is advisable to use a pragmatic,
heuristic, application-oriented approach. For example, Figs. 9 and 10 show that the Lévy
$\alpha$-stable fit discussed in this paper tends to underestimate the tails of the
probability density function (PDF) in the two investigated cases. In risk assessment
procedures, such as value-at-risk estimates, this may be an undesirable feature, and it
could be wiser to look for some other distribution whose PDF prudentially overestimates
the tail region.

FIG. 5: MIB30: Comparison of CCDFs for absolute log-returns. Circles: Empirical data. Dotted
line: Maximum-likelihood fit. Dashed line: Quantile-based fit. Solid line: Fit based on the SC
function.

FIG. 6: DJIA: Comparison of CCDFs for absolute log-returns. Circles: Empirical data. Dotted
line: Maximum-likelihood fit. Dashed line: Quantile-based fit. Solid line: Fit based on the SC
function.

FIG. 7: MIB30: Comparison of CDFs for log-returns. Circles: Empirical data. Solid line: Fit
based on the SC function.